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ABSTRACT 

We present an iterative algorithm for solving a class of 
nonlinear Laplacian system of equations in 0(fc^m log(fcn/e)) 
iterations, where fc is a measure of nonlinearity, n is the 
number of variables, m is the number of nonzero entries in 
the graph Laplacian L, t is the solution accuracy and 0{) 
neglects (non-leading) logarithmic terms. This algorithm is 
a natural nonlinear extension of the one by of Kelner et. 
ah, which solves a linear Laplacian system of equations in 
nearly linear time. Unlike the linear case, in the nonlinear 
case each iteration takes 0{n) time so the total running time 
is 0{k^mn\og{kn/e)). For sparse graphs where m = 0{n) 
and fixed k this nonlinear algorithm is 0{n^ log(n/e)) which 
is slightly faster than standard methods for solving linear 
equations, which require approximately 0{n^'^^) time. Our 
analysis relies on the construction of a nonlinear “energy 
function” and a nonlinear extension of the duality analysis 
of Kelner et. al to the nonlinear case without any explicit 
references to spectral analysis or electrical flows. These new 
insights and results provide tools for more general extensions 
to spectral theory and nonlinear applications. 

General Terms 

Graph Laplacian, Spectral analysis, Nonlinear equations 

1. INTRODUCTION 

The goal of solving linear systems of equations in nearly lin¬ 
ear time, spearheaded by Vaidya |10| . has propelled a rev¬ 
olution in spectral graph theory and several related fields. 
In addition to many important results about graph sparsi- 
fication (see [11] for overviews and references) this the¬ 
ory has led to two main types of methods of solving sets of 
symmetrical diagonally dominant linear equations in nearly 
linear time - the first using graph sparsification (and ul¬ 
tra sparsifiers) (e.g., [8]) combined with classical iterative 
methods, and the second which relies on a underlying low 
stretch spanning tree to provide an algorithm that relies on 
cycle updates for “electrical flows” in the underlying graph 
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[ 3 ]. In addition to having important applications in matrix 
computations, these results have led to a large number theo¬ 
retical results, reducing the complexity of several important 
graphical algorithms. 

The key insights underlying these approaches are the re¬ 
duction of a set of symmetrical diagonally dominant linear 
equations to a system of equations Lx = b arising from a 
graph Laplacian, L. Previous analyses (e.g., ID) rely on 
the linearity of the underlying system of equations and the 
use of a carefully chosen spanning tree on the underlying 
graph. Thus, one might expect that linearity is required 
for the analyses; however, as we will show, linearity is not 
necessary and there appears to be a more general struc¬ 
ture underlying these approaches which allows a certain de¬ 
gree of nonlinearity. More precisely, we show that one can 
generalize the graph Laplacian in a nonlinear manner and 
apply a nonlinear extension of the Algorithm in Kelner et 
al. [ 3 ] to solve them in 0(fc^m log(fen/e)) iterations and 
0(fc^mn log(fcn/e)) time. Our analysis relies on the con¬ 
struction of an “nonlinear energy function” and extension 
of their duality analysis to nonlinear Laplacian systems of 
equations. While providing new insights into the linear 
problem, this analysis shows that one can generalize graph 
Laplacians without losing much of the underlying structure 
and suggests further nonlinear generalizations. 

2. RELATED WORK 

The idea of using a spanning tree to precondition a set of 
linear equations comes from an unpublished presentation by 
Vaidya [ID]. That paper led to a large body of work over 
many years that led to Spielman and Teng’s seminal paper 
[8] which showed the theoretical existence of nearly linear 
time algorithms for systems of equations, using very sophis¬ 
ticated machinery dllllZlID]. 

Our algorithm is based on Kelner et. al’s algorithm [3], 
which dramatically reduces the amount of machinery re¬ 
quired. Its origins are clearly implicitly based on the more 
sophisticated machinery in the earlier papers, such as low- 
stretch spanning trees (e.g., [I] ) and ultra-sparsifiers (e.g., 
m), but does not require them directly. 

3. GENERALIZED MODEL AND RELAX¬ 
ATION 

We begin with an undirected weighted graph G =< V,E,w > 
with node set V, edges E and weights Wij > 0 where Wij = 
Wji. The graph Laplacian of G is given by L where Lij = 


—Wij for i ^ j and La — — '^ij ■ Given a vector 6 G SR" 
a Laplacian system of equations is simply Lx = b. Such a 
system always has a solution if ~ which we denote 

X* and is unique up to an additive constant. Throughout 
this paper whenever we talk about solutions, we will ignore 
the multiplicity of solutions due to the nullspace of L. Note 
that for notational clarity we will always assume that G is 
connected. In the case that G is not connected one can 
simply solve on each connected component separately. 

Let N{i) be the neighbors of node i in the graph G. It is 
well known that one can express the action of the Laplacian 
as 

{Lx)i= ^ Wij{xi-Xj), 
jejv(i) 

which only depends on Xi — Xj for {i, j) G E. In the following 
we will consider a generalization of this to non-linear graph 
Laplacians where 

{Lx)i^ ^ Wijhij{xi- Xj) (*) 

j€N(i) 

where the nonlinear function /lij(-) is anti-symmetric, satis¬ 
fies hij{—v) = —hij{v), is continuous, the derivative h'ij(v) 
is left-continuous, vh[j{v) is strictly increasing and 1/fe < 
h'ijiv) < k for a chosen k > 1. 

Note that the running time of our algorithm will depend on 
k and the computational complexity of In order to 

simplify the analysis with respect to hij{-), we assume that 
we have constant time oracles for computing 
and The more natural version where these oracles 

require 0(log(l/e)) computations, would only increase our 
running times by 0(log(l/e)) factors, where e is the accuracy 
of the solution. 

A simple example of this model arises when /I'j (v) = Ijk for 
|u| < 1 and h'ijiv) = 1 otherwise. This example arises when 
the cost is small for small flows, in contrast to the linear case 
where the “cost” of a flow between edges is linear in the flow. 
More generally, we can choose h'ijiv) to be constant except 
at a discrete set of points where it increases, corresponding 
to a piecewise linear hijiv) with discrete increases in slope, 
bounded by slopes 1/k and k. Surprisingly, one can also use 
an hij iv) that is not convex in the positive quadrant. For ex¬ 
ample the function given by hijiv) = u -I- arctaniv) satisfies 
the requirements and is not convex in the positive quad¬ 
rant, leading to“economies of scale” where the incremental 
cost can decrease as a flow increases. 

Following Kelner et. al [3] we can introduce new variables, 
Qij, which for generalized Laplacians allow us to rewrite 
Lx = h as: 

Mi £ V ■. ^ Wijhijigij) = bi iBCi) 

jejv(i) 

V(j,i) G E : Qij =Xi - Xj iPCi) 

which decomposes the problem in a natural way into b- 
constraints iBCi) and p-constraints (PCi), where a: is a 
‘potential function’ for g. We say that g is b-feasible if g 
satisfies the b-constraints and p-feasible for p-constraints. 


Formally, we will only have a single variable for each edge, 
written g^ for e £ E but for notational convenience we will 
write gij = —gji where one of these two variables as canon¬ 
ical and the other is simply a notational convenience. For 
example if j)<zE 9ii corresponds to a sum where each 
edge appears once, while in J2jeN(i) 9ij each g^ ap¬ 

pears twice. 

Note that our variables differ from those in Kelner et. al 
[5] and most of the literature in the linear case where in the 
electrical model our gij variables correspond to the voltage 
drop between nodes i and j while their hij corresponds to 
the current flow between those nodes. Algebraically, gij = 
hij j VJij . 

4. CYCLE UPDATES 

A key idea in the algorithm by Kelner et al [3] is to update 
the gij’s using cycles to maintain b-feasibility. Let C be an 
oriented set of edges in G that forms a cycle. Let aiC, t) be 
defined such that for all (i, j) G C, 

Wijhijigij +aijiC,t)) = Wijhijigij) +t, 

and aijiC,t) = 0 for (i,j) ^ C and note that 0(17,0) = 0 by 
construction. It can be easily checked that this procedure 
maintains b-feasibility for any value of t, since the increase 
from an incoming edge of the cycle through any node is 
canceled by the decrease in the sum from the outgoing edge 
that follows it. When hij)-) is the identity, aijiC,t) = t/wij 
as in the linear analysis. 

The key idea in the linear analysis is to iteratively update cy¬ 
cles by updating cycles using q:(C, t) in order to approach p- 
feasibility while always maintaining b-feasibility. The anal¬ 
ysis m relies on duality related to “electrical systems” with 
an energy function. In the nonlinear case we can also define 
an analogous energy function 

^( 5 ) 

ij 

where g is the vector of flip’s and 

f3ij ^ 

^iji9ij) — / ^hijis) ds. 

Jo 

In the linear case where hijiv) is the identity, this reduces 
to the standard energy function of [3], after a change of 
variables. 

The key insight in our analysis is that for any cycle C the 
“cycle adjustment”, a(C', t) that minimizes '!’((;) satisfies p- 
feasibility around the cycle. 


Lemma 1. If for some cycle C 

t* = argmirii^ig + q:(C', t)) 

then 

i9ij +Oiij{C,t*)) = 0. 

Proof. To see this note that $((?) is convex by the defi- 


nition of the /I’s and that the derivative is 


d^g + a{C,t)) 
dt 


E 

(i.j)eG 


dif^ij (^Qij OLij (C*) t) ) 

dt 


Summing over C and integrating over t yields 

^ aii{C,t)<ktlWc 

{i,j)ec 


(i,j)ec 

Inserting the definition of (f>ij into the right hand side of the 
equation yields 

= E {9ij+oiij {C, t))h'ij {gij+aij (C, t)) 

(ij)ec 

Implicitly differentiating the defining equation for a(C, t) 
with respect to t gives 

Wijh'ijigij + 

which combined with the previous equation and the dehni- 
tion of t* gives the desired result after setting the derivative 
to 0. □ 

This lemma implies our first theorem: 


Theorem 1. If g is a b-feasible flow that minimizes ^{g) 
over all “cycle updates” in G, then g is also p-feasible. 

Proof. Clearly if there are no cycle updates that de¬ 
crease the energy then by the preceding Lemma the flows 
around those cycles must sum to 0. □ 

We next derive a quantitative measure of the gain in energy 
from a cycle update. This will depend on both the initial 
flow around the cycle 

Gc = ^2, 9ii 

(ij)ec 

and the harmonic sum of the w’s around the cycle 

Wc = ( ^ 

(ij)ec 


Lemma 2. For a cycle C the decrease in energy from a 
cycle update satisfies: 

Hg) - t*)) > WcGlifiik). 


Proof. We assume that Gc < 0. (If not then simply 
traverse the cycle in the opposite direction.) As seen in the 
proof in Lemma [T] 


d^{g + a{G, t)) 
dt 


E i9it+<^i3iG,t)). 
(i,j)ec 


Integrating from t = t* to t = 0 yields 
^{g)-^{g+a{C,t*)) = {Gc+ ^ aij{G,t))dt (***). 




Note that the integrand is 0 at f = t*, Gc at t = 0 and is 
strictly decreasing over the range of integration. Using (**) 
we see that 

da^j{C, t) ^ aij{C,t)))~^ < k/wij 


Since Gc + 5I](i = 0 this implies that t* > 

—WcGc/k. Combining this with (***) yields 

Hg) - ^{g + a(C, t*)) > r [Gc + ktlWc)dt 

J — WcGc/k 

and integrating completes the proof. □ 

To compute the cycle update we can use Newton’s method 
on the derivative ^{g + a{G,t)), written in the form. 

Gc + ^2 

(ij)ec 

Note that the sum is bounded by {tlWc)lk and {t/Wc)k 
so we can use a binary search on the interval [0, {t/Wc)k]. 
Since the function is convex in t, for any t G [t*/4, 3t*/4] 
the value of the function will be reduced by at least a factor 
of 2. This will require 0{\og{k^)) iterations so the total 
time required is 0(n log(fc)) since each evaluation of the sum 
requires 0{n) time. 

Thus we have proven an approximate version of the previous 
lemma. 


Lemma 3. One can compute an approximate cycle update 
in 0(nlog(/c)) such that for any cycle G the decrease in en¬ 
ergy satisfies: 

<&(<?) - Hg + <^{c, t*)) > WcGliiAk). 

5. SPANNING TREES 

Given a spanning tree, T f- E one can analyze a simple set 
of cycles that span the cycle space of all cycles [^. Given 
an edge (i,j) £ E\T define the tree path P(i^j) to be the 
minimal oriented set of edges from node j to node i and let 
the tree cycle to be G^ij') — P(i,j) U(bi)- Thus, instead of 
minimizing 4>(^) over all cycles one only needs to minimize 
it over each of the tree cycles. 

As in [3] , we will choose edges e £ E \ T at random with 
probability pe and then minimize 4>((?) over the tree cycle 
Ge. The main difference here is that in the linear case one 
can minimize the energy over a cycle analytically while in 
the nonlinear case one needs to rely on an iterative method, 
which we described earlier. 

For later use, we define the tree-values of x, denoted x{T, g) 
to be those which satisfy gij = Xi — Xj on the tree. These 
are easily computed, up to an additive constant, using using 
a breadth first search from any leaf of the spanning tree of 
G. 


6. LAGRANGIAN DUALITY 

In order to analyze the progress made by each iteration of 
our algorithm we need an estimate of the distance to opti¬ 
mality. To do this we compute the Lagrangian dual for our 
minimization problem. 









First recall that our primal problem is to compute 

min'F((7) s.t. Wijhij{gij) = bi. Vi G V 

9 ^^ 

jeJv(i) 

The dual function is given by 

e(x) = mm{<F(5) - Xi[{ ^ Wijhij{gij)) - bi]} 

® iev jeN(i) 

where x are the dual variables, and as we will see correspond 
to the X in our original problem. (As a reminder, we only 
consider one of gij,gji as a variable, say gij and implicitly 
replace all occurrences of gjt with —gij-) The first order 
condition for this minimization when differentiating by gij 
is: 

0 ~ 4‘ij{9ij) ^j)'^ij^ij{9ij) 

Using the dehnition of cj)ij yields: 

0 = Wijhij{gij)gij — (xi — Xj)wijhij{gij) 

which reduces to gij = Xi — Xj. Plugging this into 0(®) 
yields 

Q(*) = ^ ^ 'Wijhij{Xi-Xj))-bi] 

iGV j£N{i) 

To simplify this we define gij = Xi — Xj and use the b- 
feasibility of gij to write this as 

~ y 4'ij{9ij) ~ y ] '^ij9ij{hij{gij) — hij{gij)). 

Using this we can approximate the distance to optimality 
since $(g) > 0(a;) for any b-feasible g and $( 5 *) = 0(a:*) 
where g*,x* are the optimizers of their respective problems. 
Thus, following Kelner et. al if we define 

GAP{g) = $(3) - $(3*) 

then for any b-feasible g we have 

GAPig) < $(g) - 0(a;) 

We can use the tree values for x to provide this estimate 
yielding 

GAP{g) < TGAP = TGAPij(s) 

(i,j)eE\T 

where we define TGAP by 

TGAPi3(sr) = <t>ij{gij) - (f)ij{xi - Xj) 

P Wijixi - Xj){hij{xi -Xj)- hij{gij)) 

where the sums in the gap formula need only be over non¬ 
tree edges since Xi — Xj = gij for all {i,j) G T. 

Lemma 4. Given a b-feasible g, 

GAP{g)< 

(i,j)eE\T 

where Gij is the tree cycle passing through edge (i,j)- 


Proof. Since GAP < TGAP we can focus on TGAP, 

9TGAPij(5r) _,// , , - - \u' i \ 

ogij 

substituting for (f>'ij gives 

gTGAP^J (g) ^ yj (^g _ _ xj))h'ij{gij). 

ogij 

Integrating yields 

f9ij 

TGAPij(gi) < / 'Wij{sij - (xi — Xj))k dsij 

J (Xi —Xj ) 

which after noting that Gcij = (gij — {xi — Xj)) yields 
TGAPij(5) < WijG%..k 
which implies the result after summing. □ 

7. TREE CONDITION NUMBERS 

In order to guarantee convergence in the linear case, Kelner 
et. al. [3] require that the underlying spanning tree have 
nice properties. As we will show the same condition suffices 
for the nonlinear case as well. 

Dehne the tree condition number of spanning tree T to be 
r(r) = ^ w.j/WCi, 

(i,j)eE\T 

where Gij is the tree cycle over edge {i, j). Now, as discussed 
in Kelner et. al, the tree condition number is closely related 
to the stretch of a tree, a well studied topic. The stretch of 
a tree T is dehned to be st{T) = Y))(i where 

St{iJ)=Wij ^ 1/Wrs 

and it is easy to show that r(T) = st{T) p m — 2n p 2. 

Thus, we can use the following result due to Abraham et. al 
[1] to construct a tree with a good condition number 

Theorem 2 (Abraham et. al pQ). /n 0(m log n log log n) 
time we can compute a spanning tree T with total stretch 
st{T) = 0(m log n log log n). 

8. LINEARIZATION 

For static analyses we can linearize the Laplacian at some g 
by considering the adjusted weights, «)?■ = Wijhij{gij)/gij, 
creating a new Laplacian L®. Then we can write 

j€N(i) 

Next we define the linearized energy to be the energy of the 
linearized Laplacian: 

C'(5)= E 45^/2 

(i,3)eE 

and it is easy to see that ^®(g) = ^{g)- Then, using the 
bounds on h'ij{gij) we see that 

'^ij £ [il/k)wij,kwij], 




and 


$(5) G [{yk)e{9),kiHg)] 
H9)M9*) > (l/fc")e®(5)/C^(5*). 


Proof. Lemma 6.1 in [3] shows that < k^^{g*)st{T). 
The result follows from the fact that 

H9)&[Wk)iH9),kiH9)]- 

a 


11. ALGORITHM AND MAIN RESULT 

We now present the algorithm which is a slight modification 
of that in Kelner et. al [3]. 

Algorithm: Simple Nonlinear Solver 


This will allow us to use results from the linear case to an¬ 
alyze some static properties of our algorithm. 

9. ACCURACY 

In the linear case, most spectral methods consider the ac¬ 
curacy in solution [3]. Thus we will say that a solution is 
e-accurate if 



where is the linearization of L at q*, as used in the pre¬ 
vious section. By combining our bounds on the linearization 
of the energy function (***) with Lemma 6.2 in [3] for the 
linearization we can easily prove the following: 

Lemma 5. If 

H9)/H9*)<i + ^V{riT)k^) 

then 

11^‘lli* 

where x is defined by its tree values on T. 

Proof. It is easy to see from Lemma 6.2 in [3] that if 
^(5)/^’(s*) < + then \\x - x*\\ig < e/{k^T{T)). 

To complete the proof we need the following lemma relating 
ll^’llis and ||u||ig.. □ 

Lemma 6. ||u||£g < • 

Proof. This follows from the definition: 

(ij)eB 

and noting that w?. G [{l/k)wij, kwij]. □ 

10. INITIALIZATION 

To initiate our algorithm we start with a b-feasible flow g° 
with support only on the spanning tree. This can be com¬ 
puted recursively with 0{n) operations by noting that for 
any leaf node i of the tree the value of gij for j = N{i) is 
given by Wijhij{gij) = bi. 

However, it will be useful to note that this can be computed 
by first computing the solution for the linearization of L 
denoted g^, where we set all hij{-) to be identity functions, 
using the above recursive algorithm. Then convert ^ to g^ 
using the relationship gfj — h~I{g°j). 

The following bound on the initial energy follows directly 
from Lemma 6.1 in [3]. 

Lemma 7. $( 5°) < P^{g*)st{T) 


1. INPUT: Input G = iV,E,w), e G K+, A: > 1, hij{-) V{i,j) G 

E. 

2. OUTPUT: g G a; G K'". 

3. Construct T: a low stretch spanning tree of G. 

4. Construct g°: a b-feasible solution with support only 
on the spanning tree. 

5. Set probabilities pij = Wij/{WciiT{T)) for all {i,j) G 
E\T. 

6 . Set S= r2fcV(T)log(fc®st(r)r(T)/e2)]. 

7. For £ = 1 to S: 

(a) Pick a random {i,j) £ E\T with probability pij. 

(b) Apply a cycle update to g^~^ using tree cycle 
Cij to get g^. Compute this to relative accuracy 
l/(2fc^) using binary search. 

8 . RETURN g and its tree induced x. 


In order to show the correctness of this algorithm, we ex¬ 
tend the analysis in [3]. Since the GAP(g^) conditional on 
GAP{g^~^) is a random variable we focus on expected val¬ 
ues. 

Lemma 8. For each iteration in the loop of “Algorithm: 

Simple Nonlinear Solver” 

E[GAP{g^)\ - E[GAP{g^+^)\GAP{g^)\ ^ 1 

E[GAP{g<^)] - 2k^T{T)' 

Proof. First note that the change in GAP is the same 
as the change in 'l’(g) which is greater than WcijG^.^ /{2k) 
for an update of Gij , thus the expected decrease is 

E[GAP{/)]-E[GAP{g‘+^)\GAP{g‘)] > ^ pi,Wc^^Gl^J{2k). 

(i,j)eE\T 

Substituting in pij = Wij/{T{T)Wcij) yields 

E[GAP{/)]-E[GAP{/+^)\GAP{/)] > ^ {wij/T{T))Gl^J{2k) 

(i,j)eE\T 

using Lemma 0] for the RHS completes the proof. □ 

Using this we can prove our main theorem. 

Theorem 3. Given a nonlinear graph Laplacian L, “Al¬ 
gorithm: Simple Nonlinear Solver”, computes an expected 
e-accurate solution of Lx = b in 0{mnk^ log(fcn/e)) time. 






Proof. Rewriting the expression in Lemma[5]shows that 
in order to attain e-accuracy we need 

-!.(/) - $(5‘) < eV(r(r)fc")$(/). 

Iterating Lemma jS] implies that 

- -^>(< 7 *) < (1 - - $( 5 *)). 

Combining this with Lemma Q on the initial energy yields 

<E>(/) - -^>(5*) < (1 - 

Plugging in S'= [2fe^r(r) log(fc®st(r)T(T)/e^)] satishes the 
requirement, proving the accuracy of the Algorithm. By 
Theorem [2] 

st{T) — 0(m log n log log n) = 0{m) 
the required number of iterations is given by 
r = 0(fc^m log(fcn/e)). 

Since each iteration takes 0{n) time by Lemma|3]we get the 
stated result. □ 

Note that instead of an expected e-accurate solution, one 
can compute an actual e-accurate solution with high proba¬ 
bility with the same time bound using standard probabilistic 
techniques. 

12. CONCLUDING REMARKS 

This paper has shown that one can extend some of the tech¬ 
niques from the linear theory of systems of equations arising 
from linear graph Laplacians to ones with nonlinear graph 
Laplacians. We view our analysis as suggestive that even 
more general extensions, perhaps even asymmetric ones re¬ 
lated to directed graphs, may be possible, extending the 
range of spectral analysis. Perhaps as in Kelner et. al [3] 
our algorithm has a representation as an alternating projec¬ 
tions algorithm, showing that nonlinear extensions are also 
possible in that setting. 

There are many open directions for improving our algorithm. 
First, note that we have not optimized the k dependence. 
Next, note that one could use the ideas of tree scaling and 
a randomized stopping time as in [3] to reduce the running 
time. In addition, one might be able to use algorithms for 
the linear problem to speed up the nonlinear algorithm, ei¬ 
ther by providing a warm start or by speeding up conver¬ 
gence near optimality when the linear problem might pro¬ 
vide a good approximation for the nonlinear one. 
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While this paper raises many open problems relating to non¬ 
linear spectral theory, the main problem with the current 
algorithm is the dependence on n due to the complexity of 
computing a nonlinear cycle update which unlike the linear 
version does not have a simple analytic solution which can be 
computed in logarithmic time using a special data structure 
for keeping track of the hows on the tree. Nonetheless, it 
might be possible to reduce this complexity either through 
a specially designed data structure for the spanning tree, 
or perhaps through the use of a spanning tree with small 
diameter. Also, the direct application of our algorithm to 
nonlinear graph problems might be fruitful. 




